Knowledge Lean Word-Sense Disambiguation

Authors

  • Ted Pedersen
  • Rebecca F. Bruce
Abstract

We present a corpus-based approach to word sense disambiguation that only requires information that can be automatically extracted from untagged text. We use unsupervised techniques to estimate the parameters of a model describing the conditional distribution of the sense group given the known contextual features. Both the EM algorithm and Gibbs Sampling are evaluated to determine which is most appropriate for our data. We compare their disambiguation accuracy in an experiment with thirteen different words and three feature sets. Gibbs Sampling results in small but consistent improvement in disambiguation accuracy over the EM algorithm.
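To make the estimation step concrete, the following is a minimal sketch of EM for a model with a hidden sense group and categorical contextual features. It assumes a naive Bayes factorization (features independent given the sense group), which the abstract does not state; the function name `em_naive_bayes`, its parameters, and the smoothing constant `alpha` are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def em_naive_bayes(data, n_senses, n_values, n_iter=50, seed=0, alpha=0.01):
    """Fit a latent-class naive Bayes model with EM (illustrative sketch).

    data:     list of feature tuples; feature j is an int in range(n_values[j])
    n_senses: number of hidden sense groups (assumed known in advance)
    alpha:    small pseudo-count, an assumption added here to avoid zeros
    Returns (prior, cond, post), where post[i][s] is the estimated
    P(sense group s | features of instance i) from the last E-step.
    """
    rng = random.Random(seed)
    n_feats = len(n_values)

    def normalize(weights):
        total = sum(weights)
        return [w / total for w in weights]

    # Random, strictly positive starting point to break symmetry.
    prior = normalize([rng.random() + 0.5 for _ in range(n_senses)])
    cond = [[normalize([rng.random() + 0.5 for _ in range(n_values[j])])
             for j in range(n_feats)]
            for _ in range(n_senses)]

    post = []
    for _ in range(n_iter):
        # E-step: posterior over the hidden sense group for each instance.
        post = []
        for x in data:
            joint = [prior[s] * math.prod(cond[s][j][x[j]] for j in range(n_feats))
                     for s in range(n_senses)]
            post.append(normalize(joint))

        # M-step: re-estimate parameters from expected (fractional) counts.
        for s in range(n_senses):
            weight = sum(p[s] for p in post)
            prior[s] = (weight + alpha) / (len(data) + alpha * n_senses)
            for j in range(n_feats):
                counts = [alpha] * n_values[j]
                for x, p in zip(data, post):
                    counts[x[j]] += p[s]
                cond[s][j] = normalize(counts)

    return prior, cond, post


# Toy usage: two clearly separated groups of untagged instances.
toy = [(0, 0, 0)] * 10 + [(1, 1, 1)] * 10
prior, cond, post = em_naive_bayes(toy, n_senses=2, n_values=[2, 2, 2])
groups = [max(range(2), key=lambda s: p[s]) for p in post]
```

The resulting groups are unlabeled clusters, so mapping them to dictionary senses would need a separate step. A Gibbs sampler for the same assumed model would replace the soft E-step with sampled sense assignments.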

Related resources

Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems

One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguati...

Demo: Enriching Text with RDF/OWL Encoded Senses

This demo paper describes an extension of the Enrycher text enhancement system, which annotates words in context, from a text fragment, with RDF/OWL encoded senses from WordNet and OpenCyc. The extension is based on a general purpose disambiguation algorithm which takes advantage of the structure and/or content of knowledge resources, reaching state-of-the-art performance when compared to other...

6 Unsupervised Corpus-Based Methods for WSD

This chapter focuses on unsupervised corpus-based methods of word sense discrimination that are knowledge-lean, and do not rely on external knowledge sources such as machine readable dictionaries, concept hierarchies, or sense-tagged text. They do not assign sense tags to words; rather, they discriminate among word meanings based on information found in unannotated corpora. This chapter reviews...

EBL-Hope: Multilingual Word Sense Disambiguation Using a Hybrid Knowledge-Based Technique

We present a hybrid knowledge-based approach to multilingual word sense disambiguation using BabelNet. Our approach is based on a hybrid technique derived from the modified version of the Lesk algorithm and the Jiang & Conrath similarity measure. We present our system's runs for the word sense disambiguation subtask of the Multilingual Word Sense Disambiguation and Entity Linking task of SemEva...

Publication date: 1997